International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Open Access Journal

ISSN : 2394-2320 (Online)

Monthly Journal for Computer Science and Engineering

“Context-Based Diversification for Keyword Queries over XML Data”

Authors: Snehal Ingole ¹, Dr. S. S. Prabhune ²

Date of Publication: 7th April 2016

Abstract: The problem of diversifying keyword search was first studied in the IR community. Most existing approaches perform diversification as a post-processing or re-ranking step of document retrieval, based on analysis of the result set and/or the query logs. In IR, keyword search diversification is designed at the topic or document level. The ambiguity of keyword queries makes them difficult to answer effectively, especially when they are short and vague. To address this problem, we propose an approach that automatically diversifies XML keyword search based on the different contexts of the query in the XML data. Given a short and vague keyword query and the XML data to be searched, we first derive keyword search candidates for the query using a simple feature selection model. We then design an XML keyword search diversification model to measure the quality of each candidate. After that, we propose two efficient algorithms that incrementally compute the top-k qualified query candidates as the diversified search intentions. Two selection criteria are targeted: the k selected query candidates should be the most relevant to the given query, and they should cover the maximal number of distinct results. Finally, a comprehensive evaluation on real and synthetic data sets demonstrates the effectiveness of the proposed diversification model and the efficiency of the algorithms.
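The two selection criteria named in the abstract (relevance to the query and coverage of distinct results) can be illustrated with a small greedy sketch. This is not the authors' algorithm, and the abstract does not give their scoring model; the function name `select_diverse_candidates`, the score "relevance times fraction of not-yet-covered results", and the sample candidates and result identifiers below are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def select_diverse_candidates(
    relevance: Dict[str, float],
    results: Dict[str, Set[str]],
    k: int,
) -> List[str]:
    """Greedily pick up to k query candidates.

    Assumed score (not taken from the paper): a candidate's relevance
    multiplied by the fraction of its results not yet covered by the
    candidates selected so far.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    remaining = set(relevance)

    def score(candidate: str) -> float:
        res = results.get(candidate, set())
        if not res:
            return 0.0
        novelty = len(res - covered) / len(res)
        return relevance[candidate] * novelty

    while remaining and len(selected) < k:
        # Sort first so that ties are broken deterministically by name.
        best = max(sorted(remaining), key=score)
        if score(best) == 0.0:
            break  # no remaining candidate contributes a new result
        selected.append(best)
        covered |= results.get(best, set())
        remaining.remove(best)
    return selected


# Hypothetical candidates with made-up relevance scores and result ids.
demo_relevance = {"A": 0.9, "B": 0.8, "C": 0.5}
demo_results = {"A": {"r1", "r2", "r3"}, "B": {"r1", "r2"}, "C": {"r4"}}
print(select_diverse_candidates(demo_relevance, demo_results, 2))  # ['A', 'C']
```

In this example candidate B is more relevant than C, but every result of B is already returned by A, so the sketch prefers C as the second search intention.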
